-
- The "C1" Protocol
- -----------------
-
- The "C1" description is exceedingly long, so it has been broken up into 11
- more easily digestible sections. If you haven't read any of this before,
- please read them ALL, in order.
-
- Topic Section
- ----- -------
- Inception & Concepts ........... C1-1
- A Simple Conversation .......... C1-2
- Communication Codes & Checksums. C1-3
- Statement and Listen Loops ..... C1-4
- Synchronization Lock ........... C1-5
- Block Structure ................ C1-6
- Varying Block Size ............. C1-7
- Communication Syntax ........... C1-8
- Syntax Description ............. C1-9
- The "End-Off" Situation ........ C1-10
- Transferring File Type ......... C1-11
-
-
- Section C1-1
- ------------
-
- Inception
- ---------
- During the summer of 1981, when I first got the idea of putting up a BBS, I
- started work on a simple protocol for transferring programs to and from the
- BBS. This protocol was similar in structure to XMODEM, and had about the same
- reliability. Under good line conditions, it would give error free transfers
- (this was to be expected). Under moderate noise conditions, the protocol would
- hold up, and would still give error free transmissions. It was under poor line
- conditions that it, and XMODEM, would fall apart.
-
- In the summer of 1984, I started work on a very ambitious project; to produce
- a protocol that was both fast, and extremely reliable, even under the worst of
- line conditions. From this work came the "C1" protocol; not a simple
- block/checksum affair, but a complete communication system for the computer.
-
- Be warned, therefore, that understanding the ins and outs of "C1" will not be
- easy, but with enough patience, there's no reason why even the least skilled
- programmer cannot be comfortable with it.
-
- Concepts
- --------
- The concept behind the "C1" protocol was simple; to allow two computers to
- "talk" with one another (while transferring data) in such a way that nothing
- short of a complete distortion of the transmission line could result in a
- misunderstanding. If this concept could be realized, then files could be
- transferred between computers without fear of line noise causing a breakdown
- in the protocol, or that the received data would differ, in any way, from that
- which was sent. Nothing is perfect though, and I don't, for a minute, claim
- that "C1" is completely infallible, but I can say, with reasonable comfort,
- that "C1" can deliver bad line accuracy not found in any other microcomputer
- transfer protocols. For this accuracy though, there is a price to pay, and it
- is complexity; the protocol is extremely difficult to duplicate without a
- complete and utter understanding of the intricate workings of "C1". This
- document will attempt to give you that required understanding.
-
-
-
- Section C1-2
- ------------
-
- A Simple Conversation
- ---------------------
- In first deciding how the protocol would function, I thought of how two people
- could carry on a conversation under high noise conditions, where
- misunderstanding would be the norm. The scenario I'm going to give differs from
- the protocol in that the people talking have no way of verifying the accuracy
- of what they believe they have heard. What it is meant to demonstrate is how
- the two computers "talk" with one another, and discuss the necessary
- repetition, or non-repetition, of each block of data (the cornerstone of a
- checksum based transfer protocol).
-
- Ken and John are attempting to assemble a machine in the middle of a very
- noisy machine shop. John reads the instructions to Ken, who carries them out.
- Even at close proximity, the two have difficulty hearing one another, so they
- adopt a form of banter which allows each instruction to be verified and
- acknowledged. Here is how the conversation might go:
-
- John: Put part "A" in hole "D".
-
- Ken: Understood, putting part "A" in hole "D".
-
- John: Acknowledged, let me know when you are ready for the next instruction.
-
- Ken: Go ahead, what do I do next?
-
- John: Put screw "E" through slot "T".
-
- Ken: I didn't understand that, could you please repeat.
-
- John: Oh, ok, tell me when you're ready for that instruction again.
-
- Ken: Ready now.
-
- The conversation continues on in this fashion, guaranteeing that both John and
- Ken are fully aware of what the other is doing. In real life, people wouldn't
- have the patience to keep up that sort of banter, but that's why they make
- more mistakes than a computer. It is just this sort of "conversation" that the
- two computers have between each other, only the language is different; the
- instruction is replaced by the block of data, and all other statements by
- special codes.
-
-
-
-
- Section C1-3
- ------------
-
- Communication Codes
- -------------------
- One of the areas where simple protocols fall apart is in the transmission of
- "handshaking codes". It's called handshaking because it implies that the two
- computers are having a dialogue, rather than a monologue. These other
- protocols rely on single byte (8 bit) words for their communication codes, and
- that could spell trouble, since the likelihood of any one 8 bit code being
- transposed into another is greater than for multiple byte codes. For this
- reason, "C1" uses 3 byte (24 bit) codes which are sufficiently different that
- the likelihood of a transposition is extremely low. Not only that, but as you
- will soon learn, the method of receiving 3 byte codes is designed such that if
- there is sufficient line noise to make the necessary transpositions, there
- would most likely be extra characters sent; "C1" can avoid this situation.
-
- Five distinct codes are used in the protocol; "GOO", "BAD", "ACK", "S/B", and
- "SYN". Each has its own meaning, just like any English word, and all are used
- in a specific sequence such that synchronization difficulties would be
- automatically identified and corrected.
-
- Checksums
- ---------
- When a block of data is sent, we must have a way of determining if it is
- correctly received or not. This is accomplished by using what is known as a
- checksum. Quite simply, a checksum is a number which is mathematically derived
- from all the bytes within the block. The receiving computer recalculates the
- sum and compares it with the sum it received along with the block.
- Theoretically, any fault in the transmitted data will result in the two
- checksums not matching; but that's theory. In reality, the accuracy of the
- checksum is based on the type of mathematical operation used to calculate it,
- and what kind of noise it encounters.
-
- The simplest way to create a checksum is to add up all the ASCII values of the
- bytes contained in the block. This is fine for many types of errors, but not
- the type which inverts a particular bit. Should the same bit position be
- inverted in two bytes, where it was set in one and clear in the other, the sum
- will remain the same. For example, take the following two bytes:
-
-        11010011 = 211
-   Plus 01101101 = 109
-        --------   ---
-                   320
-
- Now assume that the fourth bit from the right of both of these bytes becomes
- inverted by line noise:
-
-        11011011 = 219
-   Plus 01100101 = 101
-        --------   ---
-                   320
-
- As you can see, the sum remains 320, even though line noise has made obvious
- changes to the bytes. A better system is one called "Cyclic Redundancy", which
- works on a somewhat different principle. The checksum is 16 bits long, and is
- created in the following fashion; each byte from the block is Exclusive OR'ed
- with the low order part of the checksum. The checksum is then ROTATED one bit
- to the left, and the procedure repeated with the next byte. Even this highly
- superior method can be tripped up, so I have combined BOTH an additive
- checksum and Cyclic Redundancy checksum to create one very hard to beat 32 bit
- "super" checksum.
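-
- As a rough sketch of the two checksums just described (not the original
- code), the Python below reruns the 211 + 109 example. The function names, the
- zero starting value, and the wrap-around of the additive sum at 16 bits are
- assumptions; the text does not specify them.
-
```python
def additive_checksum(data: bytes) -> int:
    """Sum of all byte values, kept to 16 bits (the wrap-around is an assumption)."""
    return sum(data) & 0xFFFF


def rotating_checksum(data: bytes) -> int:
    """XOR each byte into the low-order part, then rotate the 16-bit value left one bit."""
    value = 0  # starting value of 0 is an assumption
    for byte in data:
        value ^= byte
        value = ((value << 1) | (value >> 15)) & 0xFFFF
    return value


original = bytes([0b11010011, 0b01101101])  # 211, 109
damaged = bytes([0b11011011, 0b01100101])   # 219, 101 (fourth bit from the right inverted)

print(additive_checksum(original), additive_checksum(damaged))       # 320 320
print(rotating_checksum(original) != rotating_checksum(damaged))     # True
```
-
- The additive sum cannot tell the two pairs apart, while the rotate-and-XOR
- value differs because each byte lands on a different bit alignment.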
-
-
-
- Section C1-4
- ------------
-
- Listening For Code Words
- ------------------------
- Although 3 byte code words are more reliable than 1 byte code words, nothing
- is perfect. It was once said that if you let an infinite number of monkeys
- bash away at typewriters for an infinite amount of time, one of them would
- eventually type "To be or not to be, that is the question". Although this
- stretches statistical probability to its limit, this kind of thing can easily
- happen on a smaller scale; the letters "GOO" could quite conceivably be
- produced by purely random line noise.
-
- To try and eliminate ALL possible errors isn't feasible, but "C1" makes an
- attempt at trying to eliminate as many as possible. One reasonably probable
- fact is that any noise capable of randomly producing "GOO", would not stop
- there; more likely, it would produce a string of characters, something like
- "HGOOEK". Were we to allow the protocol to listen exclusively for three letter
- combinations, it would most assuredly pick out the "GOO" in that string.
-
- My specifications for "C1" call for a code recognition routine which will ONLY
- make code word comparisons on the LAST 3 RECEIVED bytes. This is accomplished
- in my coding by going back and testing for further characters after I have
- identified a three byte code word. Should another byte be present, the
- identified code word is thrown away, and the search will continue.
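-
- One way to picture the "last 3 received bytes" rule is the small Python
- sketch below. It assumes the caller hands over everything that arrived before
- the line went quiet; the names CODE_WORDS and recognise are illustrative only.
-
```python
from typing import Optional

CODE_WORDS = (b"GOO", b"BAD", b"ACK", b"S/B", b"SYN")


def recognise(received: bytes) -> Optional[bytes]:
    """Return the code word formed by the LAST three bytes received, or None."""
    tail = received[-3:]
    return tail if tail in CODE_WORDS else None


print(recognise(b"GOO"))     # a clean code word is accepted
print(recognise(b"HGOOEK"))  # "GOO" buried in noise is rejected: the tail is "OEK"
print(recognise(b"GOE"))     # a damaged code word is rejected
```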
-
- Statement and Listen Loops
- --------------------------
- One immediate drawback to the system described above is that a REAL code word,
- masked within some random noise, would be rejected by the receiving computer.
- This would also be true of a code word simply damaged by noise (like "GOE").
- For a protocol to be impervious to this sort of corruption, it must be capable
- of restating code words over and over until the receiving computer can
- understand, yet it must also have a way of knowing whether the receiving
- computer got the code word or not. This was a fact that eluded me when I wrote
- the original protocol. When we talk to other people, the cornerstone of
- understanding is recognition. If we ask "What do you think?", yet get no
- reply, we ask again. Only when we receive a reply from the person to whom we
- are talking do we continue on with our next statement. It would be pointless
- wasting our breath on someone who isn't listening.
-
- Within "C1", communication between computers is handled through a similar
- system which I call the "Statement and Listen Loop". It's quite simple really;
- when one computer has to "say" something to the other, it does so, then waits
- for a predetermined time for a known response. Should it fail to receive a
- response within that period of time, the code word is said again, and the
- computer listens for the reply. This continues until the required response is
- heard. The system is further enhanced by the fact that both computers are
- ALWAYS engaged in a "Statement and Listen Loop".
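-
- The loop can be sketched in a few lines of Python. This is an illustration
- under assumptions, not the original routine: send and listen are hypothetical
- callables supplied by the caller, and listen returns None when nothing usable
- arrives within the wait time.
-
```python
from typing import Callable, Iterable, Optional


def statement_and_listen(
    say: bytes,
    expected: Iterable[bytes],
    send: Callable[[bytes], None],
    listen: Callable[[float], Optional[bytes]],
    wait: float = 0.5,
) -> bytes:
    """Repeat `say` until one of the `expected` code words is heard."""
    wanted = set(expected)
    while True:
        send(say)
        reply = listen(wait)  # None means nothing usable arrived in time
        if reply in wanted:
            return reply


# Simulated line: first reply lost, second damaged, third arrives intact.
sent = []
replies = iter([None, b"GOE", b"ACK"])
heard = statement_and_listen(b"GOO", [b"ACK"], sent.append, lambda wait: next(replies))
print(heard, len(sent))
```
-
- In the simulated run, "GOO" is stated three times before the "ACK" gets
- through.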
-
-
-
- Section C1-5
- ------------
-
- Synchronization Lock
- --------------------
- That rather ominous sounding title is actually rather simple; it refers to a
- condition whereby the "Statement and Listen Loops" of each computer become
- locked together. This is analogous to two people speaking at the same time,
- over and over, such that no effective communication takes place. In order to
- guarantee that the two computers never get into this state, the wait times of
- the loops are altered slightly.
-
- Assume that the fixed wait loop time was 0.5 seconds; this is called a "Short"
- wait. We also have a "Long" wait, which would be slightly longer, say 0.6
- seconds (actually, the delay within a "Statement and Listen Loop" is not
- particularly critical, but should be somewhere in the neighbourhood of one half
- second). Each time the computer goes through an SLL, a counter would determine
- which type of wait to use; Long or Short. The sequence is broken into three;
- the transmitting computer will use a Long-Long-Short, while the receiving
- computer will use a Short-Short-Long.
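-
- A minimal Python sketch of the two wait sequences follows, using the example
- figures of 0.5 and 0.6 seconds from the text. The name wait_times is
- hypothetical.
-
```python
from itertools import cycle, islice

SHORT, LONG = 0.5, 0.6  # example values from the text, in seconds


def wait_times(transmitting: bool):
    """Endless sequence of waits: Long-Long-Short when sending, Short-Short-Long when receiving."""
    pattern = (LONG, LONG, SHORT) if transmitting else (SHORT, SHORT, LONG)
    return cycle(pattern)


tx = list(islice(wait_times(True), 6))
rx = list(islice(wait_times(False), 6))
print(tx)
print(rx)
```
-
- At every step of the cycle the two sides wait for different lengths of time,
- so two loops that start in step cannot stay in step.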
-
-
-
- Section C1-6
- ------------
-
- Block Structure
- ---------------
- Each block of data contains somewhat more than just a collection of characters
- taken from disk; it also contains a "header". The header is 7 bytes long, and
- contains the following information:
-
- Byte 1: Low part of ADDITIVE checksum
- Byte 2: High part of ADDITIVE checksum
- Byte 3: Low part of CLC checksum
- Byte 4: High part of CLC checksum
- Byte 5: Size of NEXT block
- Byte 6: Low part of Block Number
- Byte 7: High part of Block Number
-
- As you remember from the section on "checksums", there are two distinctly
- different, 16 bit (2 byte) checksums. One is an additive checksum, composed of
- the mathematical sum of the ASCII values of all the DATA bytes (and bytes 5
- through 7 of the header). The other checksum is calculated using Cyclic (CLC)
- Redundancy (on the same bytes). These 32 checksum bits are placed in the first
- 4 bytes of the header.
-
- The 5th byte is the length of the NEXT block. This may seem odd to some, but
- consider the difficulties in sending the size of the current block in that
- self same block. You need to know the block size to calculate the checksum,
- but you can't know for sure that the block size is correct unless you have
- verified the checksum. We call this a Catch-22. By sending the size of any
- given block in the PREVIOUS block, the size is known for a fact BEFORE the
- checksum is calculated.
-
- The 6th and 7th bytes hold the block number. This was added quite early on in
- the development of "C1" under the assumption that it would be necessary (as it
- is in XMODEM). As it turned out, "C1" uses a method of handshaking which makes
- this unnecessary. None the less, my specifications call for its inclusion, as
- certain uses of the block number could be made. Also, the high order part of
- the block number (byte 7 of the header) is used to flag the last block.
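-
- The header layout can be illustrated with Python's struct module. This is a
- sketch under assumptions: the function names are made up, the checksums are
- taken over header bytes 5 through 7 followed by the data in that order, and
- the checksum details (zero start, 16 bit wrap-around) are guesses where the
- text is silent.
-
```python
import struct


def additive_checksum(data: bytes) -> int:
    return sum(data) & 0xFFFF  # 16-bit wrap-around is an assumption


def rotating_checksum(data: bytes) -> int:
    value = 0  # starting value is an assumption
    for byte in data:
        value ^= byte
        value = ((value << 1) | (value >> 15)) & 0xFFFF  # rotate left one bit
    return value


def build_block(data: bytes, next_size: int, block_number: int) -> bytes:
    """Prefix `data` with the 7-byte header: two checksums, next size, block number."""
    covered = struct.pack("<BH", next_size, block_number) + data  # header bytes 5-7, then data
    checksums = struct.pack("<HH", additive_checksum(covered), rotating_checksum(covered))
    return checksums + covered


def block_is_valid(block: bytes) -> bool:
    """Recalculate both checksums and compare them with header bytes 1-4."""
    additive, rotating = struct.unpack_from("<HH", block)
    covered = block[4:]
    return additive == additive_checksum(covered) and rotating == rotating_checksum(covered)


block = build_block(b"HELLO", next_size=47, block_number=1)
damaged = block[:-1] + b"X"
print(len(block), block_is_valid(block), block_is_valid(damaged))
```
-
- Five data bytes plus the header give a 12 byte block; changing a single data
- byte makes the check fail.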
-
-
-
- Section C1-7
- ------------
-
- Varying Block Size
- ------------------
- The reason that block size was included in the header was originally to allow
- the last block only to vary in size (one can never guarantee that the amount
- of data to be sent will divide nicely into a preset block size). It quickly
- dawned on me that "C1" was set up in such a way that ANY block size could be
- used for ANY block in the transmission. Varying block size has its
- advantages; under reasonably clean line conditions, large blocks transmit the
- most data with the least handshaking (which is mildly time consuming). Smaller
- blocks are superior under bad noise conditions, since smaller blocks run a
- higher chance of making it through the noise unscathed; and should it still
- fail to make it, less time is required to repeat a smaller block.
-
- My current implementation of "C1" allows the user to pick a fixed block size
- between 40 and 255 bytes, but in other implementations, there is no reason why
- block size couldn't be varied DURING transmission to adapt to CHANGING line
- conditions.
-
- One final question concerning block structure is how one could know the size
- of the FIRST BLOCK if that is revealed only in the block that came before it
- (quite a paradox). "C1" requires that the first block contain ONLY a header,
- which would make that block 7 bytes long. This header would do little more
- than supply the receiving computer with the size of the first REAL block.
- Accuracy of this first "dummy" block is guaranteed since it must still pass
- the checksum tests. You must make the block number for this dummy block "0".
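-
- To make the chaining of sizes concrete, the Python sketch below plans the
- blocks for a transfer with a fixed block size. Several details are
- assumptions rather than part of the specification: the name plan_blocks, the
- block size counting the 7 header bytes, a "next size" of 0 in the final
- block, and the last block keeping the low byte of its running number beneath
- the FF flag.
-
```python
HEADER_SIZE = 7


def plan_blocks(payload: bytes, block_size: int):
    """Return (block_number, next_block_size, data) tuples, dummy block first."""
    if not HEADER_SIZE < block_size <= 255:
        raise ValueError("block size must fit in one byte and leave room for data")
    step = block_size - HEADER_SIZE
    chunks = [payload[i:i + step] for i in range(0, len(payload), step)]
    sizes = [HEADER_SIZE + len(chunk) for chunk in chunks] + [0]  # 0 after the last block: assumption
    plan = [(0, sizes[0], b"")]  # header-only dummy block, numbered 0
    for index, chunk in enumerate(chunks):
        number = index + 1
        if index == len(chunks) - 1:
            number = 0xFF00 | (number & 0xFF)  # high byte FF flags the last block
        plan.append((number, sizes[index + 1], chunk))
    return plan


for number, next_size, data in plan_blocks(bytes(100), 47):
    print(hex(number), next_size, len(data))
```
-
- With 100 bytes of data and a block size of 47, the dummy block announces a 47
- byte block, and the second data block announces the shorter 27 byte final
- block.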
-
-
-
-
- Section C1-8
- ------------
-
- Communication Syntax
- --------------------
- Now that you understand block structure, handshaking methods, and code word
- vocabulary, it comes time to find out how this all comes together. Most
- protocols have very simple handshaking between blocks which is easy to trip
- up, given sufficiently noisy conditions. Usually, the transmitting computer
- sends the block, then waits for a response from the receiving computer; either
- "good" or "bad". The transmitting computer then proceeds to send the next
- block (if "good") or resend the last block (if "bad"). This system falls apart
- the moment the transmitting computer receives a false indication of "good" or
- "bad" and goes on to transmit the wrong block (and whether the receiving
- computer likes it or not, it has to contend with another block). Should things
- get out of sync, and the transmitting computer sends the next block when it
- should have sent the last one again, XMODEM attempts to make corrections by
- use of the block number encoded within each block.
-
- "C1" does nothing so crude; its very communication syntax guarantees that
- neither computer will get out of phase with the other. Whereas XMODEM uses a
- single statement monologue between each block, "C1" uses a multiple part
- dialogue. This makes "C1" about 3% slower than XMODEM, but this small
- trade-off in speed for accuracy will be well worth it the first time you run
- into trouble with XMODEM.
-
- XMODEM communications would look something like this:
-
- Xmit: Transmits Block
- Rec : "Good"
- Xmit: Transmits Next Block
- Rec : "Bad"
- Xmit: Transmits Same Block Again
-
- In "C1", the transmission would look something like this:
-
- Xmit: Transmits Block
- Rec : "Good"
- Xmit: Good block acknowledged
- Rec : Send next block for me
- Xmit: Transmits Next Block
- Rec : "Bad"
- Xmit: Bad block acknowledged
- Rec : Send that block again
- Xmit: Transmits Same Block Again
-
- In this type of transmission dialogue, neither computer can get out of sync,
- since should it receive the opposite response than it expects, it goes back to
- give the correct code word for the response it DID RECEIVE, thus regaining
- proper synchronization. Couple this with the "Statement and Listen Loops", and
- you can readily see that communication would be hard to break down.
-
-
-
-
- Section C1-9
- ------------
-
- Syntax Description
- ------------------
- The following diagram should give you an understanding of the flow of
- information between blocks:
-
- For a Good Block:
-
- Xmit: [Block] "ACK" [Next Block]
- Rec : "GOO" "S/B"
-
- For a Bad Block:
-
- Xmit: [Block] "ACK" [Same Block]
- Rec : "BAD" "S/B"
-
- Actually, the two are identical; the only difference is the substitution of
- either "GOO" or "BAD" as the response to the received block. Immediately after
- receiving the block, the receiving computer recalculates the checksum to
- determine validity of the data. In the meantime, the transmitting computer
- starts to wait for a "GOO" or "BAD" signal. Since it can "say" nothing until
- it receives one of these codes, it merely waits. That may sound suspiciously
- like a good place to "hang up" the protocol, but the receiving end is
- eventually going to finish receiving the block, either because it timed out
- waiting, or it finished collecting the correct number of bytes from the
- transmitting computer.
-
- At that time, the receiving computer sends the appropriate code word ("GOO" or
- "BAD") and begins to wait for an acknowledgement ("ACK"). If it doesn't
- receive the "ACK" in about one half second, it sends the "GOO" or "BAD" code
- word once again. Meanwhile, the transmitting computer has been patiently
- awaiting the reception of the "GOO" or "BAD" code. Once it receives it, it
- transmits an "ACK" and starts to wait for a "send block" signal ("S/B"). If
- it doesn't get the "S/B" within about one half second, it sends "ACK" again.
-
- Back at the receiving computer, which is waiting for this "ACK" signal, it
- receives it and sends the "S/B" signal and begins to wait for the block.
- Should it receive an "ACK" while waiting for the block, or receive nothing at
- all for approximately .5 seconds, it assumes that the transmitting computer
- hasn't heard the "S/B" and transmits it again. In the meantime, the
- transmitting computer is waiting for the "S/B", and upon reception, starts
- sending the block. The process has now started all over again.
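-
- The transmitting computer's half of this exchange can be traced with a short
- Python sketch. It is a simplification under assumptions: the function name is
- hypothetical, None stands for a wait that timed out, and the function merely
- records what would be sent rather than driving a real line.
-
```python
def transmitter_dialogue(heard):
    """List what the transmitter sends between two blocks, given what it hears."""
    sent = []
    verdict = None
    for word in heard:
        if word in ("GOO", "BAD"):
            verdict = word      # remember the verdict actually received
            sent.append("ACK")
        elif word == "S/B" and verdict is not None:
            sent.append("next block" if verdict == "GOO" else "same block")
            break
        elif verdict is not None:
            sent.append("ACK")  # timeout or anything else while awaiting "S/B": restate "ACK"
        # before any verdict arrives the transmitter says nothing and keeps waiting
    return sent


print(transmitter_dialogue([None, "GOO", None, "S/B"]))
print(transmitter_dialogue(["BAD", "S/B"]))
```
-
- In the first trace the "ACK" is stated twice because the first "S/B" never
- arrived; in the second the bad block is queued for repetition.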
-
- A quick analysis of this system will reveal that it's damned near impossible
- to get any type of noise which could possibly mimic the code sequences
- required. Also, no noise could stop the eventual completion of the above
- sequence, since each computer is always "sending and waiting". If two people
- keep repeating their sentences over and over, and continue to listen to the
- other person, even a noisy room couldn't stop them from hearing one another
- EVENTUALLY. Of course, some line noise is just so horrendous, that even this
- method of communication could fail. Then again, this type of noise would make
- it damned near impossible for the user to be online in the first place, so it
- can be considered an unlikely event. But, should one of the computers go
- offline for any reason, we wouldn't want the other computer to keep looping
- and looping until it died of old age.
-
- Although I haven't built such protection into the terminal program I
- distribute in the public domain, my BBS program does have abortion code.
- Should the protocol on the BBS have to go through the "Statement and Listen
- Loop" more than 24 times in a row (which is highly unlikely if the other
- computer is still online), it will abort the transfer. Similar code could be
- used in your implementation.
-
-
-
-
- Section C1-10
- -------------
-
- The End-Off Situation
- ---------------------
- When the final block is transmitted, the high order part of the block number
- should be made HEX "FF" (255 decimal). This will inform the receiving computer
- that this is the last block of data, and to expect no more. The question now
- arises; how can both computers be 100% sure that the other is fully aware of
- the file completion? A fair question, but not one with a simple answer.
-
- When the transmitting computer receives the "GOO" for the last block, it can
- be fairly certain that the receiving computer has received the final block,
- but it must inform the receiving computer that it knows this. It does so by
- sending an "ACK", but cannot be sure the receiving computer has received the
- "ACK" unless it gets the "S/B" signal back. Now, the transmitting computer
- must acknowledge the reception of the "S/B", but under the normal
- communications syntax, it would now have to send a block. This is where the
- "End-Off" syntax comes into play; after receiving the "S/B", the transmitting
- computer sends back a "SYN" signal. In response to that, the receiving computer
- sends its own "S/B" signal, then waits for the final "S/B" from the
- transmitting computer. Since it will not be responding to this code, it simply
- goes into a wait cycle for approximately 5 seconds.
-
- If it does get the "S/B" within that 5 seconds, it ends immediately, but
- otherwise doesn't really care if it receives the code or not since at this
- stage, there is a 100% assurance of both computers knowing things are Ok. The
- transmitting computer need only send three copies of the "S/B" code at this
- point, since, as stated above, there is full assurance that both computers are
- finished. NOTE that the code words chosen for the End-Off situation are not
- necessarily related to their apparent function.
-
-
-
- Section C1-11
- -------------
-
- Transferring File Type
- ----------------------
- When transferring files from one computer to another it is often necessary to
- also transfer the file type, but this must be known BEFORE the file is opened,
- and, therefore, before the protocol begins. "C1" does not impose any strict
- rules on what sort of information you transfer about the files, if any, but
- when writing a terminal program to communicate with one of my bulletin boards,
- the following should be done:
-
- Using a full implementation of the "C1" protocol (first dummy block, data
- block, and End-Off), transmit a single byte of data corresponding to the
- following file types:
-
- 1 = Program File
- 2 = SEQ File
- 3 = WordPro File
-
- Transmitting this single piece of data would require that TWO blocks be sent;
- the initial dummy block to set up the size of the first data block (of which
- there will be only one, size 8), and the data block itself, consisting of 7
- header bytes and the single file type byte. For other applications, one could
- conceivably transfer much more information, including file name, file type,
- computer type, etc. It could even be possible to transfer multiple files,
- specifying the number and name of each file in this first transmission.
- Alternately, no one said you HAVE to use this first separate transmission; if
- no information other than the file needs to be transmitted, you just send the file
- and nothing more.
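-
- The size arithmetic for the file type transmission can be checked with a few
- lines of Python. The names are illustrative, and sending the type as the
- numeric byte value (rather than the ASCII digit) is an assumption.
-
```python
HEADER_SIZE = 7
FILE_TYPES = {"program": 1, "seq": 2, "wordpro": 3}


def file_type_transfer(kind: str):
    """Return (size announced by the dummy block, data carried by the single data block)."""
    data = bytes([FILE_TYPES[kind]])  # numeric byte value is an assumption
    return HEADER_SIZE + len(data), data


announced_size, data = file_type_transfer("seq")
print(announced_size, data)
```
-
- The dummy block announces a size of 8: seven header bytes plus the one file
- type byte.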
-